Add graph validation and statistics logging for debugging graph construction by princekumarlahon · Pull Request #519 · mllam/neural-lam

princekumarlahon · 2026-03-25T18:36:01Z

Describe your changes

This PR introduces lightweight graph validation and diagnostic utilities to improve debugging during graph construction.

It adds two helper functions:

validate_graph: performs sanity checks on graph structure (shape, empty edges, invalid indices) and fails early with clear error messages.
compute_graph_stats: logs useful statistics about the graph, including number of nodes, edges, degree distribution, and isolated nodes.
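
As a rough illustration of the checks listed for validate_graph, here is a minimal sketch. The signature, the (2, E) edge_index layout, and most of the error messages are assumptions for illustration and may differ from the code in this PR; only the negative-index message and the `edge_index.max() >= num_nodes` comparison appear in the reviewed diff.

```python
import torch


def validate_graph(edge_index: torch.Tensor, num_nodes: int, name: str = "graph") -> None:
    """Raise ValueError if edge_index is not a usable (2, E) index tensor."""
    # Shape check: two rows (source, target), one column per edge
    if edge_index.dim() != 2 or edge_index.shape[0] != 2:
        raise ValueError(
            f"[{name}] expected edge_index of shape (2, E), got {tuple(edge_index.shape)}"
        )
    # Empty check comes first because min()/max() fail on empty tensors
    if edge_index.shape[1] == 0:
        raise ValueError(f"[{name}] graph has no edges")
    if edge_index.min() < 0:
        raise ValueError(f"[{name}] found negative node indices")
    # Assumes node ids are numbered 0 .. num_nodes - 1
    if edge_index.max() >= num_nodes:
        raise ValueError(
            f"[{name}] node index {int(edge_index.max())} out of range for {num_nodes} nodes"
        )
```

Calling `validate_graph(torch.tensor([[0, 1], [1, 2]]), num_nodes=3)` returns silently, while an index of 3 in the same graph raises a ValueError naming the offending graph.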

These utilities are integrated into the graph creation pipeline for:

grid-to-mesh (g2m)
mesh-to-grid (m2g)
mesh-to-mesh (m2m, per level)

Motivation and context

While working with graph construction, it can be difficult to quickly verify whether a generated graph is valid or understand its structure.

This change helps by:

catching invalid graphs early
providing quick visibility into connectivity patterns
making debugging easier when experimenting with graph configurations

Dependencies

No new dependencies are introduced.

Issue Link

closes #518

Type of change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📖 Documentation (Addition or improvements to documentation)

Checklist before requesting a review

My branch is up-to-date with the target branch
I have performed a self-review of my code
I have added docstrings for new/modified functions
I have added inline comments where necessary
I have updated the README (not required for this change)
I have added tests (validated locally with synthetic graphs)
I have given the PR a clear and descriptive name
Reviewer/assignee will be added by maintainers if needed

Checklist for reviewers

the code is readable
the code is well tested
the code is documented (including return types and parameters)
the code is easy to maintain

Author checklist after completed review

Add CHANGELOG entry (will update after review if accepted)

Checklist for assignee

PR is up to date with the base branch
tests pass
PR is assigned to the next milestone
changelog entry is added

princekumarlahon · 2026-03-25T18:36:45Z

Happy to iterate on this if any changes are needed!

kshirajahere

Hey, thanks for your work on this. I found a couple of things that are worth clearing up, and I’ve left inline comments with the details.
The PR also includes an unrelated CustomMLFlowLogger type-hint cleanup in custom_loggers.py.

kshirajahere · 2026-03-26T15:42:45Z

neural_lam/create_graph.py

+
+    import torch
+
+    degrees = torch.bincount(edge_index[1], minlength=num_nodes)


These stats use only in-degree (edge_index[1]), but the log messages say "degree" and "isolated nodes" as if they described the whole graph. In directed graphs like g2m, source-only nodes are counted as isolated even though they have outgoing edges, so the debug output is misleading.
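
To make the point concrete, here is a small sketch on a made-up directed bipartite graph (the node numbering is hypothetical and not taken from neural-lam). It compares the isolated-node count obtained from in-degree alone with the count obtained from in-degree plus out-degree.

```python
import torch

# Hypothetical graph: nodes 0-2 are sources (e.g. grid nodes),
# nodes 3-4 are targets (e.g. mesh nodes). Every edge points source -> target.
edge_index = torch.tensor([[0, 1, 2], [3, 3, 4]])
num_nodes = 5

in_degree = torch.bincount(edge_index[1], minlength=num_nodes)
out_degree = torch.bincount(edge_index[0], minlength=num_nodes)

# In-degree alone flags every source node as isolated
isolated_by_in_degree = int((in_degree == 0).sum())
# Counting both directions, no node is disconnected
isolated_by_total = int(((in_degree + out_degree) == 0).sum())

print(isolated_by_in_degree, isolated_by_total)  # prints "3 0"
```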

kshirajahere · 2026-03-26T15:46:22Z

neural_lam/create_graph.py

+    if edge_index.min() < 0:
+        raise ValueError(f"[{name}] found negative node indices")
+
+    if edge_index.max() >= num_nodes:


This check is incompatible with the hierarchical m2m graphs below, because from_networkx_with_start_index() intentionally keeps globally offset node ids. For level 1+, edge_index.max() can be much larger than num_nodes even though the graph is valid, so this check now breaks hierarchical graph generation.
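
One possible way to reconcile the range check with offset ids is to validate against a per-level starting index. The sketch below is only an illustration: `check_index_range` and its `start_index` parameter are hypothetical names, and the PR may resolve this differently.

```python
import torch


def check_index_range(edge_index, num_nodes, start_index=0, name="graph"):
    """Check that all node ids fall in [start_index, start_index + num_nodes).

    Assumes edge_index has at least one edge (min()/max() fail on empty tensors).
    """
    lo, hi = int(edge_index.min()), int(edge_index.max())
    if lo < start_index or hi >= start_index + num_nodes:
        raise ValueError(
            f"[{name}] node ids span [{lo}, {hi}], expected "
            f"[{start_index}, {start_index + num_nodes - 1}]"
        )


# Hypothetical level-1 mesh: 4 nodes whose ids start after 100 level-0 nodes
level1_edges = torch.tensor([[100, 101, 102], [101, 102, 103]])
check_index_range(level1_edges, num_nodes=4, start_index=100, name="m2m_level1")
```

With `start_index=0`, the same edges would be rejected because the largest id (103) exceeds `num_nodes`, which is the failure mode described above.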

princekumarlahon · 2026-03-27

Thanks for the feedback! I've updated the stats to use total degree (in + out) and added separate logging for in-degree and out-degree. I also added a small guard for empty graphs to avoid edge-case issues. Let me know if this looks good!
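
For readers following the thread, the changes described in this reply could look roughly like the sketch below. This is not the code from commit 8c1a482: the function signature, the returned dictionary, and the log format are assumptions, and node ids are assumed to lie in [0, num_nodes).

```python
import logging

import torch

logger = logging.getLogger(__name__)


def compute_graph_stats(edge_index, num_nodes, name="graph"):
    """Log and return degree statistics for a directed graph."""
    num_edges = int(edge_index.shape[1])
    # Guard: max()/mean() are undefined without nodes, and trivial without edges
    if num_nodes == 0 or num_edges == 0:
        logger.info("[%s] empty graph: %d nodes, %d edges", name, num_nodes, num_edges)
        return {"nodes": num_nodes, "edges": num_edges, "isolated": num_nodes}

    in_deg = torch.bincount(edge_index[1], minlength=num_nodes)
    out_deg = torch.bincount(edge_index[0], minlength=num_nodes)
    total = in_deg + out_deg

    stats = {
        "nodes": num_nodes,
        "edges": num_edges,
        "max_in": int(in_deg.max()),
        "max_out": int(out_deg.max()),
        "mean_total": float(total.float().mean()),
        # A node is isolated only if it has no incoming and no outgoing edges
        "isolated": int((total == 0).sum()),
    }
    logger.info("[%s] %s", name, stats)
    return stats
```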

princekumarlahon added 2 commits March 22, 2026 21:11

Add type hints to CustomMLFlowLogger methods

24e5825

Add graph validation and stats logging utilities

cf9f0a4

Merge branch 'main' into graph-validation

d9c2c8c

kshirajahere reviewed Mar 26, 2026

Fix degree calculation and validation logic

8c1a482

Shivampal157 mentioned this pull request Mar 29, 2026

[Test] Assert graph edge indices are in-range for all subgraph tensors (g2m / m2g / m2m / hierarchical) #534

Open

princekumarlahon wants to merge 4 commits into mllam:main from princekumarlahon:graph-validation
